Posters - Schedules
Posters at ISMB/ECCB 2021 will be presented virtually. Authors will pre-record their poster talk (5-7
minutes) and will upload it to the virtual conference platform site along with a PDF of their poster beginning July 19
and no later than July 23. All registered conference participants will have access to the poster and presentation
through the conference and content until October 31, 2021. There are Q&A opportunities through a chat
function and poster presenters can schedule small group discussions with up to 15 delegates during the conference.
Information on preparing your poster and poster talk is available at:
https://www.iscb.org/ismbeccb2021-general/presenterinfo#posters
Ideally authors should be available for interactive chat during the times noted below:
Session A: Sunday, July 25 between 15:20 - 16:20 UTC
Session B: Monday, July 26 between 15:20 - 16:20 UTC
Session C: Tuesday, July 27 between 15:20 - 16:20 UTC
Session D: Wednesday, July 28 between 15:20 - 16:20 UTC
Session E: Thursday, July 29 between 15:20 - 16:20 UTC
Short Abstract: Spatial transcriptomics (ST) provides insight not only into the level of gene activity but also into where in the tissue that activity occurs. This is possible because, unlike single cell RNA-seq experiments, ST retains information on cells’ positions within the tissue. However, ST spots contain multiple cells, so the observed signal inevitably reflects mixtures of cells of different types. To deconvolute these mixtures and infer the spatial cell type composition, various methods combining the two complementary technologies, ST and single cell RNA-seq, have been proposed. Unfortunately, these methods require both types of data and may be prone to bias due to platform-specific effects, such as sequencing depth. To address these issues, we present an innovative approach that does not require single cell data, but instead needs additional prior knowledge on marker genes. Our novel probabilistic model for cell type deconvolution in ST data, called Celloscope, was applied to mouse brain data and successfully indicated brain structures and spatially distinguished between two main neuron types: inhibitory and excitatory.
Short Abstract: Multi-modal profiling of single cells represents one of the latest technological advancements in molecular biology. Among various single-cell multi-modal strategies, cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) allows simultaneous quantification of two distinct species: RNA and cell-surface proteins. Here, we introduce CiteFuse, a streamlined package consisting of a suite of tools for doublet detection, modality integration, clustering, differential RNA and protein expression analysis, antibody-derived tag evaluation, ligand–receptor interaction analysis and interactive web-based visualization of CITE-seq data.
We demonstrate the capacity of CiteFuse to integrate the two data modalities and its relative advantage against data generated from single-modality profiling using both simulations and real-world CITE-seq data. Furthermore, we illustrate a novel doublet detection method based on a combined index of cell hashing and transcriptome data. Finally, we demonstrate CiteFuse for predicting ligand–receptor interactions by using multi-modal CITE-seq data. Collectively, we demonstrate the utility and effectiveness of CiteFuse for the integrative analysis of transcriptome and epitope profiles from CITE-seq data.
Short Abstract: Copy number alterations constitute important events in tumor evolution. Whole genome single cell sequencing data provides unprecedented insights into copy number profiles of individual cells, but is highly noisy. Reconstructing copy number event histories from this data is further complicated by the fact that the events span potentially overlapping regions of the genome.
Here, we propose Copy Number Event Tree (CONET), a probabilistic model for joint inference of the evolutionary tree on copy number events and copy number calling. CONET fully exploits the signal in scDNA-seq, as it relies directly on both the per-breakpoint and per-bin data. The model jointly infers the structure of an evolutionary tree on copy number events and copy number profiles of the cells, gaining statistical power in both tasks. CONET employs an efficient MCMC procedure to search the space of possible model structures and parameters. We introduce a range of model priors and penalties for efficient regularization. On simulated data, we demonstrate excellent performance of CONET both in breakpoint identification and tree reconstruction. CONET outperforms the compared approaches in modeling the evolution and copy number calling for 260 cells from a xenograft breast cancer sample.
CONET implementation is available at github.com/tc360950/CONET.
Short Abstract: Feature selection (marker gene selection) is widely believed to improve clustering accuracy, and is thus a key component of single cell clustering pipelines. However, we found that the performance of existing feature selection methods was inconsistent across benchmark datasets, and occasionally even worse than without feature selection. Moreover, existing methods ignored information contained in gene-gene correlations. We therefore developed DUBStepR (Determining the Underlying Basis using Stepwise Regression), a feature selection algorithm that leverages gene-gene correlations with a novel measure of inhomogeneity in feature space, termed the Density Index (DI). Despite selecting a relatively small number of genes, DUBStepR substantially outperformed existing single-cell feature selection methods across diverse clustering benchmarks. DUBStepR also demonstrated a significant improvement in additional benchmarking analyses focused on detection of rare cell types. DUBStepR is scalable to over a million cells, and can be straightforwardly applied to other data types such as single-cell ATAC-seq. We propose DUBStepR as a general-purpose feature selection solution for accurately clustering single-cell data.
Short Abstract: Single-cell transcriptomics lacks a cohesive mathematics of gene expression heterogeneity and cell type, limiting the interpretability of associated computational methods. We here provide such mathematics, developing an information-theoretic framework for quantifying the deviation of observed gene expression from a minimal ideal of cell type: that two cells of the same type should be statistically interchangeable. We showcase a novel unsupervised clustering algorithm based on our information-theoretic framework, which has no free parameters (beyond cluster number choice) and a natural correspondence to existing notions of cell type.
Short Abstract: The increasing popularity of spatial transcriptomics has allowed researchers to analyze transcriptome data in the spatial context of the tissue sample. Many methods have been developed for detecting genes with distinct spatial expression patterns. However, the application of such genes in clustering cell types has not been thoroughly studied. In the typical analysis pipeline for current RNA-seq data, clustering analysis is done on highly variable genes. Spatially variable genes can be novel markers to identify cell type clusters. Therefore, combining highly variable genes and spatially variable genes could potentially improve clustering performance. We applied six different integration methods commonly used in sequencing data, including CIMLR, MOFA+, scVI, WNN, SNF, and naive concatenation, to combine spatially variable and highly variable genes and extract useful features in a lower dimension. We applied shared nearest neighbor clustering to the reduced extracted features and evaluated their performance. We performed the first comprehensive benchmark study that evaluates integration tools for handling both genetic and spatial data, using both real datasets from different spatial transcriptomics technologies and simulated datasets. Our results show that WNN is recommended for integrating spatially variable genes and highly variable genes and extracting useful information to improve clustering.
Short Abstract: Previously our team introduced Giotto, an R package for the analysis and visualization of single-cell spatial data. Giotto represents an easy-to-use toolbox agnostic of the spatial platform used and offers a range of innovative algorithms to characterize tissue composition, identify spatial expression patterns, and find cellular interactions. To further support our overarching philosophy of creating a multi-functional tool, we built Giotto Suite, which is backwards compatible and incorporates a number of extensions and enhancements. It provides an improved framework to represent current and future spatial datasets. First, we designed data structures that capture cell morphology features (e.g. cell boundary) and that enable incorporation of individual transcript information at the subcellular level for one or more modalities. To further handle such large datasets we integrated an HDF5 representation and optimized our computations. Finally, to improve (inter)operability we i) created object converters between Giotto and other popular tools (e.g. Seurat, SpatialExperiment, etc.), ii) changed to a modular package format to promote contributions from external developers, and iii) developed a reactive object for interactive selection and analysis using the R/Shiny platform. Altogether, Giotto Suite represents a powerful toolbox ready to tackle the next generation of challenges in spatial data analysis and visualization.
Short Abstract: The application of single-cell omics technologies has resolved cell heterogeneity at increasing scale and resolution, leading to the identification of new subpopulations in both healthy and diseased states and across various tissues. Most recently, these technologies have been exploited to generate several human atlases, in addition to organ- or disease-specific sub-atlases. However, comparison between these various atlases and extraction of fundamental commonalities is difficult owing to the plethora of sequencing protocols and analytical pipelines employed. Currently, due to the limitations of integration methods, most studies involve only a few dozen samples, which may give inadequate power to identify rare cell types. Moreover, different studies have adopted different metadata standards and cell type ontologies, which makes it difficult to integrate and compare them. As a result, there is an unmet need for unified analysis, integration and annotation of public datasets in order to reveal synergy between studies and to avoid duplication of effort and irregularities of nomenclature. Here, we re-analysed all available datasets with a unified pipeline and integrated them to build a comprehensive Human Single-Cell Reference Map (HSCRM), in an attempt to elucidate system-wide characteristics of human cells in different biological contexts.
Short Abstract: The placenta consists of cells of maternal and fetal origin. Single cell RNA-seq (sc-RNAseq) presents unprecedented opportunities to characterize interactions between maternal and fetal cells in driving human labor. An important first step in analyzing this data is to identify the origins of each cell. We present plaSCenta, an algorithm that integrates genomic and transcriptomic data in a supervised learning framework to infer the maternal or fetal origin of cells. plaSCenta uses Y-chromosomal reads in samples from male pregnancies to identify fetal cells and assign labels (maternal/fetal) to cells. It then uses these labeled samples to train machine learning algorithms to predict the origin based on gene expression. Utilizing single-cell data from samples representing different tissues and states of labor, we observe that plaSCenta accurately predicts the maternal/fetal origin of cells with an AUC score greater than 0.9. We find that plaSCenta outperforms freemuxlet, a general purpose origin inference algorithm that uses genomic variants. We investigate plaSCenta’s predictions across cell types and placental locations and find the maternal/fetal breakdown concordant with established biological knowledge. plaSCenta is implemented as a dashboard to visualize the breakdown of origin across placental locations, cell types, and maternal fetal identification methods.
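The two-step idea described in this abstract (label cells from Y-chromosomal reads, then learn to predict origin from expression) can be illustrated with a minimal sketch. This is not the plaSCenta implementation: the function names, the `min_y_reads` cutoff, the toy data, and the nearest-centroid classifier are all assumptions chosen only to keep the example self-contained.

```python
from statistics import mean
from typing import Dict, List


def label_by_y_reads(y_reads: Dict[str, int], min_y_reads: int = 1) -> Dict[str, str]:
    """Label cells from a male pregnancy using Y-chromosomal read counts."""
    # Assumption: min_y_reads is a hypothetical cutoff, not a published value.
    # Cells without Y reads are called maternal here, although a real fetal
    # cell can also show zero Y reads because of dropout.
    return {
        cell: "fetal" if count >= min_y_reads else "maternal"
        for cell, count in y_reads.items()
    }


def fit_centroids(
    expression: Dict[str, List[float]], labels: Dict[str, str]
) -> Dict[str, List[float]]:
    """Average the expression profiles of the cells carrying each label."""
    centroids = {}
    for label in sorted(set(labels.values())):
        rows = [expression[cell] for cell, lab in labels.items() if lab == label]
        centroids[label] = [mean(column) for column in zip(*rows)]
    return centroids


def predict_origin(profile: List[float], centroids: Dict[str, List[float]]) -> str:
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(a: List[float], b: List[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(profile, centroids[label]))


# Toy data (invented): Y read counts and two-gene expression profiles per cell.
y_reads = {"c1": 0, "c2": 4, "c3": 0, "c4": 2}
expression = {
    "c1": [5.0, 0.0],
    "c2": [0.0, 6.0],
    "c3": [4.0, 1.0],
    "c4": [1.0, 5.0],
}

labels = label_by_y_reads(y_reads)
centroids = fit_centroids(expression, labels)
print(labels)
print(predict_origin([0.2, 4.8], centroids))
```

The classifier trained on Y-labelled samples can then be applied to samples where Y reads are uninformative, such as female pregnancies; any supervised learner could stand in for the nearest-centroid rule used here.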
Short Abstract: Metabolism plays a central role in shaping immune cell function. Within the tumour microenvironment, immune cells are in dynamic interaction with multiple different cell types, shifting the balance of tumour progression and anti-tumour immunity. This interplay is influenced by resource availability as well as cell-type specific metabolic states. We have developed a high-throughput flow cytometry method named Met-Flow, that measures key metabolic protein levels in a heterogeneous population, and at the single-cell level. This integrative proteomic analysis combines immune cell surface markers, cellular signalling by phosphorylation and cytokine production, which is directly linked to divergent metabolic states. Using Fast Fourier Transform-accelerated interpolation-based t-distributed stochastic neighbour embedding (FitSNE), we demonstrated metabolic remodelling of immune populations at single-cell resolution. By multiplexing metabolism to immune function, we identified unique metabolic states of memory T cell subsets under glycolytic perturbation. Measuring the capacity for flux through metabolic pathways using Met-Flow can be applied to any given cell type, and opens up the opportunity to target specific pathways across diseases.
Short Abstract: Cells continuously interact with each other to provide signaling cues to nearby cells during development. However, our understanding of the role of cell contact is still very limited. Approaches to study cell-cell interaction from single cell RNAseq (scRNAseq) rely heavily on co-expression of ligand-receptor pairs.
To understand cell-cell interactions beyond what ligand-receptor pairs can explain, we used RNA sequencing from physically interacting cells (PIC-seq) as well as scRNAseq from developing mouse embryos (at embryonic days E7.5, E8.5, and E9.5). PIC-seq data were generated by mildly dissociating mouse embryos. Our computational framework based on a support vector machine (SVM) found that cells exhibit specific gene expression when they are in contact with other types of cells. For instance, Lhx5 is expressed in neural progenitor (NP) cells neighboring definitive endoderm (DE), while Gsc is expressed in the DE cells interacting with NP. Based on this, we were able to predict the neighboring cell type by reading the transcriptome of a cell. We further developed spatial-tSNE to show the spatial organization of tissue in two-dimensional space based on the cell neighbor prediction. Our study reveals niche-specific gene expression and explains the cell heterogeneity in the scRNAseq data.
Short Abstract: To investigate molecular mechanisms underlying cell state changes, a crucial analysis is to identify differentially expressed (DE) genes along the pseudotime inferred from single-cell RNA-sequencing data. However, existing methods do not account for pseudotime inference uncertainty, and they have either ill-posed p-values or restrictive models. Here we propose PseudotimeDE, a DE gene identification method that adapts to various pseudotime inference methods, accounts for pseudotime inference uncertainty, and outputs well-calibrated p-values. Comprehensive simulations and real-data applications verify that PseudotimeDE outperforms existing methods in false discovery rate control and power.
Short Abstract: Recent advances in spatially resolved transcriptomics provide spatial context for tissue domains, cell types, and their underlying functions. However, intrinsic tissue architectures often cannot be fully revealed using existing cell clustering methods due to a lack of strong representation for the biological context in tissues. We introduce RESEPT (REconstructing and Segmenting Expression mapped pseudo-RGB images based on sPatially resolved Transcriptomics), a novel framework to accurately characterize spatial heterogeneity from spatially resolved transcriptomics. Learning a spatially constrained graph neural network, RESEPT first converts spatial gene expression to a pseudo-RGB image for single-cell data visualization and interpretation. Then a CNN-based supervised segmentation model is trained on these pseudo images to reveal spatial contexts. A benchmark comparison on 16 human brain datasets of 10x Visium spatial transcriptomics shows that RESEPT outperforms other state-of-the-art methods in different normalization settings. On an in-house Alzheimer's Disease (AD) brain dataset, RESEPT contributed significantly to identifying layers 2 and 3 of human postmortem middle temporal gyrus (MTG) and to understanding the mechanism of the cellular and regional vulnerability of MTG in early AD. RESEPT was also applied to a glioblastoma tumor in our study, where it contributed to identifying infiltrating tumor and showed potential clinical use in prognosis.
Short Abstract: A pressing challenge in single-cell transcriptomics is to benchmark experimental protocols and computational methods. A solution is to use computational simulators, but existing simulators cannot simultaneously achieve three goals: preserving genes, capturing gene correlations, and generating any number of cells with varying sequencing depths. To fill this gap, we propose scDesign2, a transparent simulator that achieves all three goals and generates high-fidelity synthetic data for multiple single-cell gene expression count-based technologies. In particular, scDesign2 is advantageous in its transparent use of probabilistic models and its ability to capture gene correlations via copulas.
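To illustrate what capturing gene correlations via copulas means in practice, the following sketch draws correlated counts for two genes. It is not scDesign2 itself: the Gaussian copula, the Poisson marginals, and all function names and parameter values are assumptions chosen for a small, dependency-free example.

```python
import math
import random
from statistics import NormalDist
from typing import List, Tuple


def poisson_quantile(u: float, lam: float) -> int:
    """Return the smallest k with P(X <= k) >= u for X ~ Poisson(lam)."""
    # Suitable for moderate lam only; exp(-lam) underflows for very large lam.
    k = 0
    pmf = math.exp(-lam)
    cdf = pmf
    while cdf < u:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k


def simulate_two_genes(
    n_cells: int, lam1: float, lam2: float, rho: float, seed: int = 0
) -> List[Tuple[int, int]]:
    """Sample counts for two genes whose dependence comes from a Gaussian copula."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    cells = []
    for _ in range(n_cells):
        # Step 1: correlated standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Step 2: map to uniforms with the normal CDF (clamped below 1 so the
        # quantile search always terminates).
        u1 = min(std_normal.cdf(z1), 1.0 - 1e-12)
        u2 = min(std_normal.cdf(z2), 1.0 - 1e-12)
        # Step 3: push each uniform through the inverse CDF of its own marginal.
        cells.append((poisson_quantile(u1, lam1), poisson_quantile(u2, lam2)))
    return cells


# Hypothetical parameters: mean counts 5 and 3, copula correlation 0.8.
cells = simulate_two_genes(n_cells=1000, lam1=5.0, lam2=3.0, rho=0.8, seed=1)
print(cells[:5])
```

The point of the construction is that the marginal distribution of each gene and the dependence between genes are specified separately, so the number of simulated cells or the marginal means can be changed without altering the correlation structure.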
Short Abstract: Single-cell multi-omics data continues to grow at an unprecedented pace, and effectively integrating different modalities holds the promise of better characterization of cell identities. Although a number of methods have demonstrated promising results in integrating multiple modalities from the same tissue, the complexity and scale of data compositions typically present in cell atlases still pose a significant challenge for existing methods. Here we present scJoint, a transfer learning method to integrate atlas-scale, heterogeneous collections of scRNA-seq and scATAC-seq data. scJoint leverages information from annotated scRNA-seq data in a semi-supervised framework and uses a neural network to train simultaneously on labeled and unlabeled data, enabling label transfer and joint visualization in an integrative framework. Using multiple atlas datasets and a biologically varying multi-modal dataset, we demonstrate that scJoint is computationally efficient and consistently achieves significantly higher cell type label accuracy than existing methods while providing meaningful joint visualizations. This suggests scJoint is effective in overcoming the heterogeneity in different modalities towards a more comprehensive understanding of cellular phenotypes.
Short Abstract: Understanding intra-tumor heterogeneity is the cornerstone of developing effective cancer treatment and precision medicine. The development of single-cell DNA sequencing technology remarkably increases the resolution of DNA profiles to the single-cell level. This facilitates the inference of phylogenetic trees with individual tumor cells as leaves, providing an evolutionary model of the mechanism behind intra-tumor heterogeneity. However, most of the methods proposed for tree reconstruction from single-cell data are based on the infinite-sites assumption, which is often violated in reality due to evolutionary events like loss of heterozygosity.
Here, we develop a novel computational model, called Sieve, to jointly infer tumor phylogeny and call variants under the finite-sites assumption from single-cell data. We propose a novel rate matrix, with states representing genotypes corresponding to heterozygous and homozygous mutations. To properly integrate the noisy single-cell sequencing data, we develop a Dirichlet-Multinomial based probabilistic model of the sequencing coverage and nucleotide read counts. The model accounts for allelic dropouts. To acquire accurate branch lengths, acquisition bias correction is applied. We show that Sieve outperforms existing approaches on simulated data, especially regarding branch lengths and calling homozygous mutations. Sieve is then applied to publicly available real datasets. Sieve is implemented as a package of Beast 2.
Short Abstract: Advances in single-cell technologies enable the routine interrogation of chromatin accessibility for tens of thousands of single cells, elucidating gene regulatory processes at an unprecedented resolution. Meanwhile, size, sparsity and high dimensionality of the resulting data continue to pose challenges for its computational analysis, and specifically the integration of data from different sources.
We have developed a dedicated computational approach, a variational auto-encoder using a noise model specifically designed for single-cell ATAC-seq data, which facilitates simultaneous dimensionality reduction and batch correction via an adversarial learning strategy. We showcase its benefits for detailed cell type characterization on individual real and simulated data sets as well as for integrating multiple complex datasets.
Short Abstract: Renal fibrosis is the hallmark of chronic kidney disease and is incompletely understood. The kidney is a complex organ with a large heterogeneity of cell types such as glomerular cells, tubule-epithelial cells, mesenchymal cells, neuronal cells, endothelial cells and immune cells. The vascular niche is of key importance in kidney fibrosis since it harbours fibrosis-driving cells, and capillary loss is thought to be one of the main drivers of disease progression. To understand mechanisms of fibrosis and capillary loss, we performed single cell RNA-sequencing and fate tracing of the major cell types of the vascular / perivascular niche including Gli1, Ng2, Myh11, Pdgfrb and Cd31+ cells in transgenic mice subjected to Unilateral Ureteral Obstruction (UUO) versus sham. Ten different 10x single cell datasets were generated from genetically tagged sorted cells. The datasets were integrated to study cell type-wise functional activity such as pathway activity, intercellular communication and cell differentiation. Several biologically meaningful findings supported by prior studies were identified, including Pdgfa-Pdgfrb interactions with the JAK-STAT pathway. In addition, about 170 genes across datasets were found as candidate driver genes for cell differentiation in renal fibrosis. A literature review indicated that 40 of these genes are known driver genes or fibrosis-related genes.
Short Abstract: Imaging-based spatial transcriptomics has the power to reveal patterns of single-cell gene expression by detecting mRNA transcripts as individually resolved spots in multiplexed images. However, molecular quantification has been severely limited by the computational challenges of segmenting poorly outlined, overlapping cells, and of overcoming technical noise; the majority of transcripts are routinely discarded because they fall outside the segmentation boundaries. This lost information leads to less accurate gene count matrices and weakens downstream analyses, such as cell type or gene program identification.
Here, we present Sparcle, a probabilistic model that reassigns transcripts to cells based on gene covariation patterns and incorporates spatial features such as distance to nucleus. We demonstrate its utility on multiplexed error-robust fluorescence in situ hybridization (MERFISH), single-molecule FISH (smFISH) data, and spatially-resolved transcript amplicon readout mapping (STARmap).
Sparcle improves transcript assignment, providing more realistic per-cell quantification of each gene, better delineation of cell boundaries, and improved cluster assignments. Critically, our approach does not require an accurate segmentation and is agnostic to technological platform.
Short Abstract: The fast development of single-cell transcriptome sequencing has advanced our knowledge of embryonic development. However, the spatial context, which is important for understanding tissue formation and cell-cell interaction, is lost during cell dissociation. Here, we applied Slide-seqV2 to multiple mouse embryo sections at different stages to build a spatial map of early mouse organogenesis with near-cellular resolution. We performed joint clustering of spatial location and gene expression patterns at different scales. By comparing with the single-cell transcriptome atlas, we showed that our approach not only accurately characterized cell types, but also resolved transcriptionally similar cell states with spatial patterns. We found known and novel genes with transcriptional biases along the neural crest and neural tube axis. We further investigated the role of Tbx6 in the specification of paraxial mesoderm and revealed the impact of Tbx6 knockout on neural tube patterning. In sum, we spatially mapped the emerging cell states at different scales, from whole embryo to specific regions, in mouse organogenesis.
Short Abstract: Spatial omics data are advancing the study of tissue organization and cellular communication at an unprecedented scale. For this purpose we developed “Spatial Quantification of Molecular Data in Python” (Squidpy), a Python-based framework for the analysis of spatially-resolved omics data. Squidpy aims to bring the diversity of spatial data into a common data representation and provide a common set of analysis and interactive visualization tools. Such infrastructure is useful in a variety of analysis settings, for different data types, and it explicitly leverages the additional information that spatial data provides: the spatial coordinates and, when available, the tissue image. Squidpy is built on top of Scanpy and Anndata, and it relies on several scientific computing libraries in Python, such as Scikit-image and Napari. Its modularity makes it suitable to be interfaced with a variety of additional tools in the Python data science and machine learning ecosystem, as well as several single-cell data analysis packages. It allows users to quickly explore spatial datasets and lays the foundations for both spatial omics data analysis and novel methods development.
Short Abstract: With the advent of single-cell RNA-sequencing, researchers now have the ability to define cell types from large amounts of transcriptome information. Over the years, various clustering algorithms have been designed. Currently, most clustering algorithms measure cell-to-cell similarities using distance metrics based on the assumption that each cluster is composed of “nearby” neighbors. In effect, clusters are a collection of similar cells in the embedded metric. Here, we propose that a biological cluster should be composed of sets of cells that satisfy a set of stoichiometric constraints, whose intersections define a cell type. We propose to model each cell population with a single affine subspace, where all cells of the same type share a common set of linear constraints. We present an algorithm that leverages this subspace structure and learns a cell-to-cell affinity matrix based on notions of subspace similarity. We simulate scRNA-seq data according to the subspace model and benchmark our algorithm against pre-existing methods. We further test our algorithm on an in-house C. elegans dataset and show recovery of information on both cell type and developmental time. Lastly, we demonstrate how the subspace model allows us to compactly recover the major genes involved in an organism’s development.